ABSTRACT

Making a linguistic theory is like making a programming 
language: one typically devises a type system to delineate 
the acceptable utterances and a denotational semantics to 
explain observations on their behavior. Via this connection, 
the programming language concept of delimited continua- 
tions can help analyze natural language phenomena such as 
quantification and polarity sensitivity. Using a logical meta- 
language whose syntax includes control operators and whose 
semantics involves evaluation order, these analyses can be 
expressed in direct style rather than continuation-passing 
style, and these phenomena can be thought of as computa- 
tional side effects. 
and features — control structures; J.5 [Linguistics]
1. INTRODUCTION

This paper is about computational linguistics, in the sense 
of applying insights from computer science to linguistics. 
Linguistics strives to scientifically explain empirical observations
of natural language. Semantics, in particular, is concerned
with phenomena such as the following. In (1) below,
some sentences to the left entail their counterparts to the 
right, but others do not. 
(1) Every student passed ⊢ Every diligent student passed
No student passed ⊢ No diligent student passed

A student passed ⊬ A diligent student passed
Most students passed ⊬ Most diligent students passed

The sentence in (2) is ambiguous between at least two read- 
ings. On one reading, the speaker must decline to run any 
spot that fails to substantiate any claims whatsoever. On 
another reading, there exist certain claims (anti-war ones, 
say) such that the speaker must decline to run any spot 
that fails to substantiate them. 

(2) We must decline to run any spot that fails to substantiate
certain claims.¹

Finally, among the four sentences in (3), only (3a) is accept- 
able. That is, only it can be used in idealized conversation. 
The unacceptability of the rest is notated with asterisks.

(3) a. No student liked any course. 

b. *Every student liked any course. 

c. *A student liked any course. 

d. *Most students liked any course. 

The linguistic entailments and non-entailments in (1) are
facts about English, in that only a speaker of English can 
make these judgments. Nevertheless, they presumably have 
to do with corresponding logical entailments and non-en- 
tailments: both the English speaker who judges that Ev- 
ery student passed entails Every diligent student passed and 
the Mandarin speaker who judges that Meige xuesheng dou 
jige-le entails Meige yonggong-de xuesheng dou jige-le rely 
on knowing that, if every student passed, then every diligent 
student passed. Thus the typical linguistic theory specifies 
a semantics for natural language by translating declarative 
sentences into logical statements with truth conditions. The 
linguistic entailments in (1) hold, goes the theory, because
the meanings — truth conditions — of the two sentences are 
such that any model that verifies the former also verifies the 
latter. Much work in natural language semantics aims in 
this way, as depicted in Figure 1, to explain the horizontal
by positing the vertical. This approach is reminiscent
of programming language research where an ill-understood
language (perhaps one with a complicating feature like exceptions)
is studied by translation into a simpler language
(without exceptions) that is better understood.

¹This sentence is part of a statement made by the cable
television company Comcast after its CNN channel rejected
an anti-war commercial hours before it was scheduled to air
on January 28, 2003.

Every student passed ⊢

|

∀x. student(x) ⇒ passed(x) ⊢

|

(some truth condition on models) ⊢
Figure 1: The translation/denotation

The translation target posited in natural language semantics
is often some combination of the λ-calculus and predicate
logic. For example, the verb passed might be translated
as λx. passed(x). This paper argues by example that
the translation target should be a logical metalanguage with 
first-class delimited continuations. The examples are two 
natural language phenomena: quantification by words like 
every and most in (1), and polarity sensitivity on the part 
of words like any in (3) . 

Quantification was first analyzed explicitly using contin- 
uations by Barker (2002). Building on that insight, this 
paper makes the following two contributions. First, I analyze
natural language in direct style rather than in continuation-passing
style. In other words, the logical metalanguage
used here is one that includes control operators for delimited
continuations, rather than a pure λ-calculus in which
denotations need to handle continuations explicitly. Natural
language is thus endowed with an operational semantics
from computer science that is richer than just βη-reduction.

Second, I propose a new analysis of polarity sensitivity 
that improves upon prior theories in explaining why No stu- 
dent liked any course is acceptable but *Any student liked 
no course is not. This analysis crucially relies on the notion 
of evaluation order from programming languages, thus elu- 
cidating the role of control effects in natural language and 
supporting the broader claim that linguistic phenomena can 
be fruitfully thought of as computational side effects. 

The rest of this paper is organized as follows. In §2, 
I introduce a simple grammatical formalism. In §3, I de- 
scribe the linguistic phenomenon of quantification and show 
a straw man analysis that deals with some cases but not 
others. I then introduce a programming language with de- 
limited continuations and use it to improve the straw man 
analysis: quantification in non-subject position is treated 
in §4, and inverse scope is covered in §5. In §6, I turn to the 
linguistic phenomenon of polarity sensitivity and show how 
a computationally motivated notion of evaluation order improves
upon previous analyses. In §7, I place these examples
in a broader context and conclude. 

2. A GRAMMATICAL FORMALISM 

In this section, I introduce a simple grammatical formal- 
ism for use in the rest of the paper. It is a notational variant 
of categorial grammar (as introduced by Carpenter (1997;
chapter 4), for instance). 

The verb like usually requires an object to its right and a 
subject to its left. 

(4) a. Alice liked CS187. 

b. *Alice liked. 

c. *Alice liked Bob CS187. 



Every diligent student passed

|

∀x. (student(x) ∧ diligent(x)) ⇒ passed(x)

|

(some other truth condition on models)
approach to natural language semantics

Intuitively, like is a function that takes two arguments, and 
the sentences (4b-c) are unacceptable due to type mismatch. 
We can model this formally by assigning types to the denotations
of Alice, CS187, and liked, which we take to be atomic
expressions.

(5) ⟦Alice⟧ = alice : Thing

(6) ⟦CS187⟧ = cs187 : Thing

(7) ⟦liked⟧ = liked : Thing → Thing → Bool

Here Thing is the type of individual objects, and Bool is the 
type of truth values or propositions. Following (justifiable) 
standard practice in linguistics, we let liked take its object as 
the first argument and its subject as the second argument. 
For example, in (4a), the first argument to liked is cs187,
and the second argument is alice. 

As (4a) shows, there are two ways to combine expressions.
A function can take its argument either to its right (combin- 
ing liked with CS187) or to its left (combining Alice with 
liked CS187). We denote these two cases with two infix op- 
erators: "/" for forward combination and "\" for backward 
combination. (The tick marks depict the direction in which 
a function "leans on" an argument.) 

(8) f / x = f(x) : β where f : α → β, x : α

(9) x \ f = f(x) : β where f : α → β, x : α

We can now derive the sentence (4a) — that is, prove it to 
have type Bool. The derivation can be written as a tree (10) 
or a term (11). 

(10) 

Alice 

liked CS187 

(11) ⟦Alice⟧ \ (⟦liked⟧ / ⟦CS187⟧) = liked cs187 alice : Bool

By convention, the infix operators / and \ associate to the 
right, so parentheses such as those in (11) above are optional. 

Unfortunately, the system set up so far derives not only 
the acceptable sentence (4a) but also the unacceptable sen- 
tence (12), with the same meaning. 

(12) *Alice CS187 liked. 

The reason the system derives (12) is that the direction of 
function application is unconstrained: in the derivation be- 
low, liked takes its first (object) argument to the left, which 
is usually disallowed in English. 

(13) 

Alice 

CS187 liked 

(14) ⟦Alice⟧ \ ⟦CS187⟧ \ ⟦liked⟧ = liked cs187 alice : Bool

To rule out this derivation of (12) in our type system, we 
split the function type constructor "→" into two type constructors,
written here "→/" (argument to the right) and "→\" (argument
to the left), one for each direction of application.
Using these new type constructors, we change the
denotation of liked to specify that its first argument is to its
right and its second argument is to its left.

(15) ⟦liked⟧ = liked : Thing →/ Thing →\ Bool

We also revise the combination rules (8) and (9) to require 
different function type constructors. 

(16) f / x : β where f : α →/ β, x : α

(17) x \ f : β where f : α →\ β, x : α

The system now rejects (12) while continuing to accept (4a), 
as desired. 
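
The effect of rules (15) through (17) can be checked mechanically. The following is a minimal sketch in Python, not part of the formalism itself: the class Fn, the dictionary LEXICON, and the function derive are names invented here for illustration, and the search simply tries every binary bracketing of the word sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fn:
    """A directional function type. `direction` is "/" when the argument
    is expected on the right and "\\" when it is expected on the left."""
    direction: str
    arg: object
    result: object


THING, BOOL = "Thing", "Bool"

# Hypothetical lexicon following (5), (6), and (15): liked takes its
# object to the right, then its subject to the left.
LEXICON = {
    "Alice": THING,
    "CS187": THING,
    "liked": Fn("/", THING, Fn("\\", THING, BOOL)),
}


def forward(f, x):
    """Rule (16): f / x is well typed only if f looks rightward for x."""
    if isinstance(f, Fn) and f.direction == "/" and f.arg == x:
        return f.result
    return None


def backward(x, f):
    """Rule (17): x \\ f is well typed only if f looks leftward for x."""
    if isinstance(f, Fn) and f.direction == "\\" and f.arg == x:
        return f.result
    return None


def derive(words):
    """Return every type derivable for the word sequence."""
    if len(words) == 1:
        return {LEXICON[words[0]]}
    types = set()
    for i in range(1, len(words)):
        for left in derive(words[:i]):
            for right in derive(words[i:]):
                for t in (forward(left, right), backward(left, right)):
                    if t is not None:
                        types.add(t)
    return types
```

Under these assumptions, derive(["Alice", "liked", "CS187"]) yields the single type Bool, whereas the word orders in (4b) and (12) yield no type at all.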

3. QUANTIFICATION 

The linguistic phenomenon of quantification is illustrated 
by the following sentences. 

(18) a. Every student liked CS187. 

b. Some student liked every course. 

c. Alice consulted Bob before most meetings. 

As with the previously encountered sentences, the natural 
language semanticist wants to translate English into logical 
formulas that account for entailment and other properties. 

More precisely, the problem is to posit translation rules that 
map these sentences thus. For instance, we would like to 
map (18a) to a formula like 

(19) ∀x. student(x) ⇒ x \ liked / cs187 : Bool,
where the constants

(20) ∀ : (Thing → Bool) → Bool, ⇒ : Bool → Bool → Bool

are drawn from the (higher-order) abstract syntax of predi- 
cate logic. To this end, what should the subject noun phrase 
every student denote? Unlike with Alice, there is nothing of
type Thing that the quantificational noun phrase every stu- 
dent can denote and still allow the desired translation (19) 
to be generated. At the same time, we would like to retain 
the denotation that we previously computed for the verb 
phrase liked CS187, namely liked / cs187. Taking these considerations
into account, one way to translate (18a) to (19)
is for the determiner every to denote

(21) ⟦every⟧ = λ/r. λ/s. ∀x. r(x) ⇒ x \ s

: (Thing → Bool) →/ (Thing →\ Bool) →/ Bool.

Here the restrictor r and the scope s are λ-bound variables
intended to receive, respectively, the denotations of the noun
student (of type Thing → Bool) and the verb phrase liked
CS187 (of type Thing →\ Bool). (More precisely, r and s are
λ/-bound variables; the tick mark again signifies the direction
of function application.) In a non-quantificational sentence
like (4a), the verb phrase takes the subject as its argument; 
by contrast, in the quantificational sentence (18a), the sub- 
ject takes the verb phrase as its argument. 

Extended with the lexical entry (21) for every, and as- 
suming that student denotes 

(22) ⟦student⟧ = student : Thing → Bool,



the grammar can derive the sentence (18a). 

(23)
every student liked CS187

(24) (⟦every⟧ / ⟦student⟧) / ⟦liked⟧ / ⟦CS187⟧ = (19)

The existential determiner some can be analyzed similarly: 

let some denote 

(25) ⟦some⟧ = λ/r. λ/s. ∃x. r(x) ∧ x \ s

: (Thing → Bool) →/ (Thing →\ Bool) →/ Bool

to derive the sentence Some student liked CS187. 



(26) 




some student liked CS187 

(27) (⟦some⟧ / ⟦student⟧) / ⟦liked⟧ / ⟦CS187⟧

= ∃x. student(x) ∧ x \ liked / cs187 : Bool

To summarize, we treat determiners like every and some 
as functions of two arguments: the restrictor and the scope 
of a quantifier, both functions from Thing to Bool. Such 
higher-order functions are a popular analysis of natural language
determiners, and have been known to semanticists
since Montague (1974) as generalized quantifiers. However, 
the simplistic account presented above only handles quan- 
tificational noun phrases in subject position, as in (18a) but 
not (18b) or (18c). For example, in (18b), neither forward 
nor backward combination can apply to join the verb liked, 
of type Thing →/ Thing →\ Bool, to its object every course, of
type (Thing →\ Bool) →/ Bool. Yet, empirically speaking, the
sentence (18b) is not only acceptable but in fact ambiguous 
between two available readings. This problem has prompted 
a great variety of supplementary proposals in the linguistics 
literature (Barwise and Cooper 1981; Hendriks 1993; May 
1985; inter alia). The next section presents a solution using 
delimited continuations. 
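
To make the generalized-quantifier denotations (21) and (25) concrete, they can be evaluated over a small finite model. The sketch below is only an illustration: the model (THINGS, STUDENTS, LIKES) is invented here, and Python's untyped functions ignore the directionality that the grammar tracks.

```python
# A tiny finite model. Every name below is an illustrative assumption.
THINGS = {"alice", "bob", "carol", "cs187"}
STUDENTS = {"alice", "bob", "carol"}
LIKES = {("alice", "cs187"), ("bob", "cs187")}  # (subject, object) pairs


def student(x):
    return x in STUDENTS


def liked(obj):
    # As in (7), liked takes its object first and its subject second.
    return lambda subj: (subj, obj) in LIKES


def every(r):
    # (21): for all x, r(x) implies s(x)
    return lambda s: all(s(x) for x in THINGS if r(x))


def some(r):
    # (25): there is an x such that r(x) and s(x)
    return lambda s: any(r(x) and s(x) for x in THINGS)


# The subject takes the verb phrase as its argument, as in (24) and (27).
every_student_liked_cs187 = every(student)(liked("cs187"))
some_student_liked_cs187 = some(student)(liked("cs187"))
```

In this model carol did not like cs187, so the universal sentence comes out false and the existential one true. The sketch offers no way to place every(...) in object position without rewriting liked, which is the limitation discussed above.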

4. DELIMITED CONTINUATIONS 

First-class continuations represent "the entire (default) fu- 
ture for the computation" (Kelsey, Clinger, Rees et al. 1998). 
Refining this concept, Felleisen (1988) introduced delimited 
continuations, which encapsulate only a prefix of that future. 
This paper uses shift and reset (Danvy and Filinski 1989, 
1990), a popular choice of control operators for delimited 
continuations. 

To review briefly, the shift operator (notated ξ) captures
the current context of computation, removing it and making
it available to the program as a function. For example, when
evaluating the term

(28) 10 × (ξf. 1 + f(2)),

the variable f is bound to the function that multiplies every
number by 10. Thus the above expression evaluates to 21 via 
the following sequence of reductions. (The reduced subex- 
pression at each step is underlined.) 

(29) [10 × (ξf. 1 + f(2))]

▷ [1 + (λv. [10 × v])(2)]

▷ [1 + [10 × 2]] ▷ [1 + [20]] ▷ [1 + 20] ▷ [21] ▷ 21



Term reductions are performed deterministically in applica- 
tive order: call-by-value and left-to-right. 

The reset operator (notated with square brackets [ ]) de- 
lineates how far shift can reach: shift captures the current 
context of computation up to the closest dynamically en- 
closing reset. Hence "3 x" below is out of reach. 

(30) [3 × [10 × (ξf. 1 + f(2))]]

▷ [3 × [1 + (λv. [10 × v])(2)]]

▷ [3 × [1 + [10 × 2]]] ▷ [3 × [1 + [20]]] ▷ ··· ▷ 63

Shift and reset come with an operational semantics (illus- 
trated by the reductions above) as well as a denotational 
semantics (via the CPS transform). That both kinds of se- 
mantics are available is important to linguistics, because 
the meanings of natural language expressions (studied in se- 
mantics) need to be related to how humans process them 
(studied in psycholinguistics) . 
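
The reductions in (29) and (30) can be replayed with a small continuation-passing encoding of shift and reset. This is a sketch under my own naming (ret, bind, reset, shift, run); it follows the usual CPS definitions but makes no claim to mirror the typed metalanguage of this paper.

```python
def ret(v):
    """A pure value: hand v to the current continuation."""
    return lambda k: k(v)


def bind(m, f):
    """Evaluate m, then f(result): left to right, call by value."""
    return lambda k: m(lambda v: f(v)(k))


def reset(m):
    """Delimit m: run it in the empty context, then resume outside."""
    return lambda k: k(m(lambda v: v))


def shift(body):
    """Capture the context up to the nearest reset and pass it to body.
    Each call to the captured context is itself delimited."""
    return lambda k: body(lambda v: ret(k(v)))(lambda v: v)


def run(m):
    return m(lambda v: v)


# The shift expression shared by (28) and (30): xi f. 1 + f(2)
one_plus_f2 = shift(lambda f: bind(f(2), lambda r: ret(1 + r)))

# (29): [10 * (xi f. 1 + f(2))]
ex29 = reset(bind(one_plus_f2, lambda x: ret(10 * x)))

# (30): [3 * [10 * (xi f. 1 + f(2))]]
ex30 = reset(bind(reset(bind(one_plus_f2, lambda x: ret(10 * x))),
                  lambda y: ret(3 * y)))
```

Here run(ex29) gives 21 and run(ex30) gives 63, because the inner reset in (30) keeps the multiplication by 3 out of reach of the shift.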

Quantificational expressions in natural language can be 
thought of as phrases that manipulate their context. In a 
sentence like Alice liked CS187 (4a), the context of CS187
is the function mapping each thing x to the proposition that 
Alice liked x. Compared to the proper noun CS187, what is 
special about a quantificational expression like every course 
is that it captures its surrounding context when used. 

(31) Alice liked [every course]. 

Thus, loosely speaking, the meaning of the sentence (31) no 
longer has the overall shape alice \ liked / ··· once the occurrence
of every course is considered, much as the meaning of
the program (28) no longer has the overall shape 10 × ···
once the shift expression is evaluated. Let us add shift and 
reset to the target language of our translation from English. 
We can then translate every course as 

(32) ⟦every course⟧ = ξs. ∀x. course(x) ⇒ s(x) : Thing^Bool_Bool.

The type notation α^δ_γ here indicates an α with a control
effect; the CPS transform maps it to (α → γ) → δ. The
denotation of every course behaves locally as a Thing, but 
requires the current context to have the answer type Bool 
and maintains that answer type. 

To see the new denotation (32) in action, let us derive
the sentence (31). The type of every course is Thing^Bool_Bool,
similar to the type Thing of CS187, so the derivation of (31) 
is analogous to (10-11). 

(33)
Alice
liked every course

(34) [⟦Alice⟧ \ ⟦liked⟧ / ⟦every course⟧]

= [alice \ liked / ξs. ∀x. course(x) ⇒ s(x)]

▷ [∀x. course(x) ⇒ (λv. [alice \ liked / v])(x)] ▷ ···

▷ ∀x. course(x) ⇒ alice \ liked / x : Bool

Like the straw man analysis in §3, the denotation in (32) 
generalizes to determiners other than every: we can abstract 
the noun course out of every course, and deal with some 
student similarly. 

(35) ⟦every⟧ = λ/r. ξs. ∀x. r(x) ⇒ s(x),

(36) ⟦some⟧ = λ/r. ξs. ∃x. r(x) ∧ s(x)

: (Thing → Bool) →/ Thing^Bool_Bool



(We require here that the restrictor r have the type Thing →
Bool, not a type of the form Thing → Bool^δ_γ, so r cannot
incur control effects when applied to x. Any control effect in 
the restrictor, such as induced by the quantificational noun 
phrase a company in the sentence Every representative of a 
company left, must be contained within reset.) 

More importantly, unlike the straw man analysis, the new 
analysis works uniformly for quantificational expressions in 
subject, object, and other positions, such as in (18a-c). Intuitively,
this is because shift captures the context of an
expression no matter how deeply it is embedded in the sentence.²
By adding control operators for delimited continuations
to our logical metalanguage, we arrive at an analysis
of quantification with greater empirical coverage. 
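
The uniform treatment of subject and object position can be illustrated by letting a quantifier denotation in the style of (35) build its logical formula as a string. The following sketch assumes the standard CPS encoding of shift and reset; the helper names (quantifier, every, sentence) and the hand-supplied variable names are my own, not part of the analysis.

```python
def ret(v):
    return lambda k: k(v)


def bind(m, f):
    # Evaluate m, then f(result): left to right, call by value.
    return lambda k: m(lambda v: f(v)(k))


def reset(m):
    return lambda k: k(m(lambda v: v))


def shift(body):
    return lambda k: body(lambda v: ret(k(v)))(lambda v: v)


def quantifier(symbol, connective, noun, var):
    """A denotation in the style of (35) and (36) that builds a formula
    string. The variable name is supplied by hand in this sketch."""
    return shift(lambda s: bind(s(var), lambda body: ret(
        f"{symbol}{var}. {noun}({var}) {connective} {body}")))


def every(noun, var="x"):
    return quantifier("∀", "⇒", noun, var)


def sentence(subject, verb, obj):
    # [subject \ verb / object], evaluating the subject before the object.
    m = bind(subject, lambda s: bind(obj, lambda o: ret(f"{s} \\ {verb} / {o}")))
    return reset(m)(lambda v: v)


object_position = sentence(ret("alice"), "liked", every("course"))
subject_position = sentence(every("student"), "liked", ret("cs187"))
```

The first result is the formula at the end of (34) and the second is (19); the same every works in both positions because shift captures whatever context it happens to be evaluated in.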

Figure 2 shows a logical metalanguage that formalizes the 
basic ideas presented above. It is in this language that 
denotations on this page are written and reduced. Refin- 
ing Danvy and Filinski's original shift-reset language, we 
distinguish between pure and impure expressions. An im- 
pure expression may incur control effects when evaluated, 
whereas a pure expression only incurs control effects con- 
tained within reset (Danvy and Hatcliff 1992, 1994; Nielsen 
2001; Thielecke 2003). This distinction is reflected in the 
typing judgments: an impure judgment 

(37) Γ ⊢ E : α^δ_γ

not only gives a type α for E itself but also specifies two
answer types γ and δ. By contrast, a pure judgment

(38) Γ ⊢ E : α

only gives a type α for E itself. As can be seen in the Lift
rule, pure expressions are polymorphic in the answer type. 

As mentioned in §2, the use of directionality in function 
types to control word order is not new in linguistics, but the 
use of delimited control operators to analyze quantification 
is. It turns out that we can tie the potential presence of 
control effects in function bodies to directionality. That is, 
only directional functions — those whose types are decorated 
with tick marks — are potentially impure; all non-directional 
functions we need to deal with, including contexts captured 
by shift, are pure. Another link between directionality and 
control effects is that the →/E and →\E rules for directional
function application are not merely mirror images of each
other: the answer types γ₀ through γ₃ are chained differently
through the premises. This is due to left-to-right evaluation. 

Having made the distinction between pure and impure 
expressions, we require in our Shift rule that the body of 
a shift expression be pure. This change from Danvy and 
Filinski's original system simplifies the type system and the 
CPS transform, but a shift expression ξf. E in their language
may need to be rewritten here to ξf. [E].

The CPS transform for the metalanguage follows from 
the typing rules and is standard; it supplies a denotational 
semantics. The operational semantics for the metalanguage 
specifies a computation relation between complete terms; it 
is also standard and shown in Figure 3. 

The present analysis is almost, but not quite, the direct- 
style analogue of Barker's (2002) CPS analysis. Put in 
direct-style terms, Barker's function bodies are always pure, 
whereas function bodies here can harbor control effects. In 

²No matter how deep, that is, up to the closest dynamically
enclosing reset. Control delimiters correspond to islands in 
natural language (Barker 2002). 
Figure 2: A logical metalanguage with directionality and delimited control operators



Values V ::= U | λx. E | λ/x. E | λ\x. E

Unknowns U ::= c | UV | U / V | V \ U

Contexts C⟨ ⟩ ::= ⟨ ⟩ | (C⟨ ⟩)E | C⟨ ⟩ / E | C⟨ ⟩ \ E

| V(C⟨ ⟩) | V / C⟨ ⟩ | V \ C⟨ ⟩
Metacontexts D⟨ ⟩ ::= ⟨ ⟩ | C⟨[D⟨ ⟩]⟩

Computations E ▷ E′

D⟨C⟨(λx. E)V⟩⟩ ▷ D⟨C⟨E{x ↦ V}⟩⟩

D⟨C⟨(λ/x. E) / V⟩⟩ ▷ D⟨C⟨E{x ↦ V}⟩⟩

D⟨C⟨V \ (λ\x. E)⟩⟩ ▷ D⟨C⟨E{x ↦ V}⟩⟩

D⟨C⟨[V]⟩⟩ ▷ D⟨C⟨V⟩⟩

D⟨C⟨ξf. E⟩⟩ ▷ D⟨E{f ↦ λx. [C⟨x⟩]}⟩

Figure 3: Reductions for the logical metalanguage 

other words, function bodies are allowed to shift, as in the 
determiner denotations in (35) and (36). By contrast. Barker 
uses choice functions to assign meanings to determiners. 

5. QUANTIFIER SCOPE AMBIGUITY

Of course, natural language phenomena are never as sim- 
ple as a couple of programming language control operators. 
Quantification is no exception, so to speak. For example, 
the sentence Some student liked every course (18b) is am- 
biguous between the following two readings. 

(39) ∃x. student(x) ∧ ∀y. course(y) ⇒ x \ liked / y

(40) ∀y. course(y) ⇒ ∃x. student(x) ∧ x \ liked / y



In the surface scope reading (39), some takes scope over ev- 
ery. In the inverse scope reading (40), every takes scope 
over some. Given that evaluation takes place from left to 

right, the shift for some student is evaluated before the shift 
for every course. Our grammar thus predicts the surface 
scope reading but not the inverse scope reading. This pre- 
diction can be seen in the first few reductions of the (unique) 
derivation for (18b): 

(41) [(⟦some⟧ / ⟦student⟧) \ ⟦liked⟧ / ⟦every⟧ / ⟦course⟧]
= [((λ/r. ξs. ∃x. r(x) ∧ s(x)) / student)

\ liked / ((λ/r. ξs. ∀y. r(y) ⇒ s(y)) / course)]

▷ [(ξs. ∃x. student(x) ∧ s(x))

\ liked / ((λ/r. ξs. ∀y. r(y) ⇒ s(y)) / course)]

▷ [∃x. student(x) ∧ (λv. [v

\ liked / ((λ/r. ξs. ∀y. r(y) ⇒ s(y)) / course)])(x)]

Regardless of what evaluation order we specify, as long as 
our rules for semantic translation remain deterministic, they 
will only generate one reading for the sentence. Hence our 
theory fails to predict the ambiguity of the sentence (18b). 

To better account for the data, we need to introduce some 
sort of nondeterminism into our theory. There are two natu- 
ral ways to proceed. First, we can allow arbitrary evaluation 
order, not just left-to-right. This change would render our 
term calculus nonconfluent, a result unwelcome for most programming
language researchers but welcome for us in light of
the ambiguous natural language sentence (18b). This route 
has been pursued with some success by Barker (2002) and 
de Groote (2001). However, there are empirical reasons to 



maintain left-to-right evaluation, one of which appears in §6.

A second way to introduce nondeterminism is to maintain
left-to-right evaluation but generalize shift and reset to
a hierarchy of control operators (Barker 2000; Danvy and 
Filinski 1990; Shan and Barker 2003), leaving it unspecified 
at which level on the hierarchy each quantificational phrase 
shifts. Following Danvy and Filinski, we extend our logical 
metalanguage by superscripting every shift expression and 
pair of reset brackets with a nonnegative integer to indicate 
a level on the control hierarchy. Level 0 is the highest level
(not the lowest). When a shift expression at level n is eval- 
uated, it captures the current context of computation up to 
the closest dynamically enclosing reset at level n or higher 
(smaller). For example, whereas the expression 

(42) [3 × [10 × (ξ¹f. 1 + f(2))]¹]⁰
evaluates to 63 as in (30), the expression

(43) [3 × [10 × (ξ⁰f. 1 + f(2))]¹]⁰

evaluates to 1 + 3 × 10 × 2, or 61. The superscripts can be
thought of as "strength levels" for shifts and resets.

Danvy and Filinski (1990) give a denotational semantics 
for multiple levels of delimited control using continuations 
of higher-order type. We can take advantage of that work 
in our quantificational denotations (35-36) by letting them 
shift at any level. The ambiguity of (18b) is then predicted 
as follows. Suppose that some student shifts at level m and 
every course shifts at level n. 

(44) Some^m student liked every^n course.

If m ≤ n, the surface scope reading (39) results. If m > n,
the inverse scope reading (40) results. In general, a quanti- 
fier that shifts at a higher level always scopes over another 
that shifts on a lower level, regardless of which one is eval- 
uated first. This way, evaluation order does not determine 
scoping possibilities among quantifiers in a sentence unless 
two quantifiers happen to shift at the same level. 
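
A two-level fragment of the control hierarchy is enough to observe both the arithmetic contrast between 63 and 61 and the two scopings of (18b). The sketch below uses a doubly continuation-passing encoding in the spirit of Danvy and Filinski (1990), restricted to levels 0 and 1; all function names are mine, and the superscripts assumed for the arithmetic example are a shift at the tested level, an inner reset at level 1, and an outer reset at level 0.

```python
THETA1 = lambda v, k2: k2(v)   # empty level-1 context
THETA2 = lambda v: v           # empty level-0 context


def ret(v):
    return lambda k1, k2: k1(v, k2)


def bind(m, f):
    # Left-to-right, call-by-value sequencing.
    return lambda k1, k2: m(lambda v, k2b: f(v)(k1, k2b), k2)


def reset1(m):
    # Level-1 (weaker) delimiter: the current k1 moves into the outer context.
    return lambda k1, k2: m(THETA1, lambda v: k1(v, k2))


def reset0(m):
    # Level-0 (strongest) delimiter: both contexts start out empty.
    return lambda k1, k2: k1(m(THETA1, THETA2), k2)


def shift1(body):
    # Captures only k1, that is, up to the nearest reset of either level.
    def run(k1, k2):
        captured = lambda v: lambda k1b, k2b: k1(v, lambda w: k1b(w, k2b))
        return body(captured)(THETA1, k2)
    return run


def shift0(body):
    # Captures k1 and k2 together, that is, up to the nearest level-0 reset.
    def run(k1, k2):
        captured = lambda v: lambda k1b, k2b: k1b(k1(v, k2), k2b)
        return body(captured)(THETA1, THETA2)
    return run


def evaluate(m):
    return m(THETA1, THETA2)


def arithmetic(shift):
    # [3 * [10 * (shift f. 1 + f(2))]^1]^0
    inner = bind(shift(lambda f: bind(f(2), lambda r: ret(1 + r))),
                 lambda x: ret(10 * x))
    return evaluate(reset0(bind(reset1(inner), lambda y: ret(3 * y))))


def quantifier(shift, symbol, connective, noun, var):
    return shift(lambda s: bind(s(var), lambda b: ret(
        f"{symbol}{var}. {noun}({var}) {connective} {b}")))


def some_student_liked_every_course(shift_some, shift_every):
    some = quantifier(shift_some, "∃", "∧", "student", "x")
    every = quantifier(shift_every, "∀", "⇒", "course", "y")
    return evaluate(reset0(bind(some, lambda s: bind(
        every, lambda o: ret(f"{s} \\ liked / {o}")))))


SURFACE = "∃x. student(x) ∧ ∀y. course(y) ⇒ x \\ liked / y"
INVERSE = "∀y. course(y) ⇒ ∃x. student(x) ∧ x \\ liked / y"
```

With this encoding, arithmetic(shift1) is 63 and arithmetic(shift0) is 61. Letting some shift at level 1 and every at level 0 produces the inverse scope formula (40), while every other combination of the two levels produces the surface scope formula (39).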

To summarize the discussion so far, whether we introduce
nondeterministic evaluation order or a hierarchy of delimited 
control operators, we can account for the ambiguity of the 
sentence (18b), as well as more complicated cases of quantifi- 
cation in English and Mandarin (Shan 2003). For example, 
both the nondeterministic evaluation order approach and 
the control hierarchy approach predict correctly that the 
sentence below, with three quantifiers, is 5-way ambiguous. 

(45) Every representative of a company saw most samples. 

Despite the fact that there are three quantifiers in this sen- 
tence and 3! = 6, this sentence has only 5 readings. Because 
a company occurs within the restrictor of every representa- 
tive of a company, it is incoherent for every to scope over 
most and most over a. The reason neither approach generates
such a reading can be seen in the denotation of every
in (35): "∀x" is located immediately above "r(x)" in the abstract
syntax, with no intervening control delimiter, so no
control operator can insert any material (such as most-quantification
over samples) in between.

There exist in the computational linguistics literature al- 
gorithms for computing the possible quantifier scopings of 
a sentence like (45) (Hobbs and Shieber 1987; followed by
Lewin 1990; Moran 1988). Having related quantifier scoping 
to control operators, we gain a denotational understanding 



of these algorithms that accords with our theoretical intu- 
itions and empirical observations. 

An extended logical metalanguage with an infinite hierar- 
chy of control operators is shown in Figure 4. This system is 
more complex than the one in Figure 2 in two ways. First, 
instead of making a binary distinction between pure and im- 
pure expressions, we use a number to measure "how pure" 
each expression is. An expression is pure up to level n if it 
only incurs control effects at levels above n when evaluated. 
Pure expressions are the special case when n = 0. The pu- 
rity level of an expression is reflected in its typing judgment: 
a judgment 

(46) Γ ⊢ E : α!n

states that the expression E is pure up to level n. Here α!n
is a computation type with n levels: as defined in the figure,
it consists of 2^(n+1) − 1 value types that together specify how
a computation that is pure up to level n affects answer types
between levels 0 and n − 1. In the special case where n = 1,
the computation type α!1 is of the familiar form α^δ_γ.

In the previous system in Figure 2, directional functions 
are always impure (that is, pure up to level 1) while non- 
directional functions are always pure (that is, pure up to 
level 0). In the current system, both kinds of functions de- 
clare in their types up to what level their bodies are pure. 
For example, the determiners every and some, now allowed 
to shift at any level, both have not just the type 

(47) (Things Bool) A Thing^°°|i;:, 

y<.n 

but also the type 

(48) (Thing^Bool^?;;:)AThing^°5S 

72!n 

(but see the second technical complication below). As the 
argument type Things BooCji^ above shows, the first argu- 
ment to these determiners, the restrictor, is non-directional 
yet can be impure (that is, pure up to level n + 1).

To traverse the control hierarchy, we add a new Reset rule, 
which makes an expression more pure, and a new Lift rule, 
which makes an expression less pure. (Consecutive nested 
resets like can be abbreviated to [EY^ without loss 

of coherence.) 

A second complication in this system, in contrast to Fig- 
ure 2, is that we can no longer encode logical quantification 
using a higher-order constant like ∀ : (Thing → Bool) → Bool,
because such a constant requires its argument — the logical 
formula to be quantified — to be a pure function. This re- 
quirement is problematic because it is exactly the impurity 
of quantified logical formulas that underlies this account of 
quantifier scope ambiguity. On one hand, we want to quan- 
tify logical formulas that are impure; on the other hand, we 
want to rule out expressions like 

(49) ∀x. ξf. x,

where the logical variable x "leaks" illicitly into the sur- 
rounding context. This issue is precisely the problem of clas- 
sifying open and closed terms in staged programming (see 
Taha and Nielsen 2003 and references therein): the types 
Thing and Bool really represent not individuals or truth val- 
ues but staged programs that compute individuals and truth 
values. For this paper, we adopt the simplistic solution of 
adjoining to these types a set of free logical variables for 
Figure 4: Extending the logical metalanguage to a hierarchy of control operators



tracking purposes, denoted p,q, . . . . Unfortunately, we also 
need to stipulate that these logical variables be freshly α-renamed
("created by gensym") for each occurrence of a
quantifier in a sentence. 

As before, the CPS transform and reductions for this met- 
alanguage are standard; the latter appears in Figure 5. 

The present analysis is almost, but not quite, the direct- 
style analogue of Shan and Barker's (2003) CPS analysis, 
even though both use a control hierarchy. Each level in Shan 
and Barker's hierarchy is intuitively a staged computation 
produced at one level higher. More concretely, a computation
type with n levels in that system has the shape



(50) 

rather than the shape 
(51) 



("71)51 



The issue above of how to encode logical quantification over 
impure formulas receives a more satisfactory treatment in 
Shan and Barker's system: no stipulation of α-renaming is
necessary, because there is no analogue of (49) to prohibit. 
The relation between that system and staged programming 
with effects has yet to be explored. 

6. POLARITY SENSITIVITY 

Because the analysis so far focuses on the truth-conditional
meaning of quantifiers, it equates the determiners a



and some — both are existential quantifiers with the type and 

denotation in (36). Furthermore, sentences like Has anyone 
arrived? suggest that the determiner any also means the 
same thing as a and some. To the contrary, though, the de- 
terminers a, some, and any are not always interchangeable 
in their existential usage. The sentences and readings in (52) 
show that they take scope differently relative to negation (in 
these cases the quantifier no). 

(52) a. No student liked some course. (unambiguous ∃¬)

b. No student liked a course. (ambiguous ¬∃, ∃¬)

c. No student liked any course. (unambiguous ¬∃)

d. Some student liked no course. (unambiguous ∃¬)

e. A student liked no course. (ambiguous ¬∃, ∃¬)

f. *Any student liked no course. (unacceptable)

The determiner any is a negative polarity item: to a first 
approximation, it can occur only in downward-entailing con- 
texts, such as under the scope of a monotonically decreasing 
quantifier (Ladusaw 1979). A quantifier $q$, of type
$(\mathrm{Thing} \to \mathrm{Bool}) \to \mathrm{Bool}$, is monotonically decreasing just in case

(53) $\forall s_1.\ \forall s_2.\ (\forall x.\ s_2(x) \Rightarrow s_1(x)) \Rightarrow q(s_1) \Rightarrow q(s_2)$.

The quantificational noun phrases no student and no course 
are monotonically decreasing since, for instance, if no stu- 
dent liked any course in general, then no student liked any 
computer science course in particular. 
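
As a rough illustration of (53), the following Python sketch checks the
definition by brute force over a small finite domain. The domain THINGS,
the set STUDENTS, and all function names are hypothetical and chosen only
for this example; a predicate of type Thing → Bool is represented by its
extension, that is, the set of things it is true of.

```python
from itertools import combinations

# Hypothetical toy domain, for illustration only.
THINGS = ["alice", "bob", "carol"]
STUDENTS = {"alice", "bob"}


def extensions(things):
    """Every possible predicate over `things`, given as a frozenset extension."""
    return [frozenset(c)
            for r in range(len(things) + 1)
            for c in combinations(things, r)]


def no_student(s):
    """Quantifier for 'no student': true iff no student satisfies predicate s."""
    return not any(s(x) for x in STUDENTS)


def a_student(s):
    """Quantifier for 'a student': true iff some student satisfies predicate s."""
    return any(s(x) for x in STUDENTS)


def is_monotonically_decreasing(q):
    """Check (53): whenever s2 implies s1 pointwise, q(s1) implies q(s2)."""
    for ext1 in extensions(THINGS):
        for ext2 in extensions(THINGS):
            if ext2 <= ext1:  # forall x. s2(x) => s1(x)
                holds_s1 = q(lambda x: x in ext1)
                holds_s2 = q(lambda x: x in ext2)
                if holds_s1 and not holds_s2:
                    return False
    return True
```

Under these assumptions, `is_monotonically_decreasing(no_student)` holds
while `is_monotonically_decreasing(a_student)` does not, matching the
informal argument above for no student.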

Whereas any is a negative polarity item, some is a positive
polarity item. Roughly speaking, some is allergic to
downward-entailing contexts (especially those with an overtly
negative word like no). These generalizations regarding
polarity items cover the data in (52a-e): in principle, goes the
theory, all those sentences are ambiguous between two scopings,
but the polarity sensitivity of some and any rules out
one scoping each in (52a), (52c), (52d), and (52f). These
four sentences are thus predicted to be unambiguous, but it
remains unclear why (52f) is downright unacceptable.

Figure 5: Reductions for the extended logical metalanguage

In the type-theoretic tradition of linguistics, polarity
sensitivity is typically implemented by splitting the answer type
Bool into several types, each a different functor applied to
Bool, that are related by subtyping (Bernardi 2002; Bernardi
and Moot 2001; Fry 1999). For instance, to differentiate the
determiners in (52) from each other in our formalism, we
can add the types BoolPos and BoolNeg alongside Bool, such
that both are supertypes of Bool (but not of each other).



(54) $\mathrm{Bool} \le \mathrm{BoolPos} \qquad \mathrm{Bool} \le \mathrm{BoolNeg}$



We also extend the subtyping relation between (value and 
computation) types with the usual closure rules, and allow 
implicit coercion from a subtype to a supertype. 



(55) 
(56) 
(57) 
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r\-E:l3[n 



Sub 



We then add a side condition to Reset, requiring that the 
produced answer type be Bool or BoolPos, not BoolNeg. 



(58) 



(59) 



r h S : 



rh[E]:p 



Reset where $\beta \le \mathrm{BoolPos}$



r\- E -.a 



l3<(n+l) 



r\-[E]: l3\i 



Reset where $\beta \le \mathrm{BoolPos}$



Finally, we refine the types of our determiners from (47) to 



(60) 



|no] : (Thing Bool) Thing 



(61) [some] : (Thing ->■ Bool) Thing 



BoolNeg^!^ ' 

BoolPos^',';^ 
BoolPos*'," ' 



(62) 
(63) 



la] : (Thing- 
[any] : (Thing - 



Bool) Thing| 
Bool) Thing 



Bool'"," ' 

BoolNeg^ 
BoolNeg"* 



The chain of answer-type transitions from one quantificational
expression to the next acts as a finite-state automaton,
shown in Figure 6. The states of the automaton are
the three supertypes of Bool; the $\epsilon$-transitions are the two
subtyping relations in (54); and the non-$\epsilon$ transitions are the
determiners in (60-63).




Figure 6: An automaton of answer-type transitions 

This three-state machine enforces polarity constraints as
follows. Any valid derivation for a sentence assigns each
of its quantifiers to shift at a certain level in the control
hierarchy. For each level, the quantifiers at that level, in
the order in which they are evaluated, must form

• either a path from BoolPos to Bool in the state ma- 
chine, in other words a string of determiners matching 
the regular expression "some* (a | no any*)*"; 

• or a path from Bool to Bool in the state machine, in 
other words a string of determiners matching the reg- 
ular expression "(a | no any*)*". 

Furthermore, the BoolPos-to-Bool levels in the hierarchy 
must all be higher than the Bool-to-Bool levels. Every as- 
signment of quantifiers to levels that satisfies these condi- 
tions gives a reading for the sentence, in which quantifiers 
at higher levels scope wider, and, among quantifiers at the 
same level, ones evaluated earlier scope wider. 
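
The two regular expressions above can be tried out directly. The sketch
below is only an illustration: it encodes a level as a list of determiner
words in evaluation order, and the function name classify_level and its
return labels are hypothetical. Each word is followed by a space before
matching so that a and any remain distinct tokens.

```python
import re

# The two regular expressions quoted in the text, over determiner tokens.
# Every token is written with a trailing space so "a " never matches "any ".
POS_TO_BOOL = re.compile(r"(some )*(a |no (any )*)*")
BOOL_TO_BOOL = re.compile(r"(a |no (any )*)*")


def classify_level(determiners):
    """Classify the determiners at one level, listed in evaluation order.

    Returns "Bool-to-Bool" or "BoolPos-to-Bool" (hypothetical labels), or
    None if the sequence matches neither expression. Any Bool-to-Bool
    string also matches the first expression, because some* may be empty;
    the more specific label is reported.
    """
    text = "".join(word + " " for word in determiners)
    if BOOL_TO_BOOL.fullmatch(text):
        return "Bool-to-Bool"
    if POS_TO_BOOL.fullmatch(text):
        return "BoolPos-to-Bool"
    return None
```

For example, `classify_level(["no", "any"])` is accepted as a Bool-to-Bool
path, whereas `classify_level(["any", "no"])` and
`classify_level(["no", "some"])` are rejected, since any must follow a no
and some may only come first.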

Consider now the two alternative ways to characterize 
scope ambiguity suggested in §5. The first approach is to 
allow arbitrary evaluation order (and use a degenerate con- 
trol hierarchy of one level only). If we take this route, 
we can account for all of the acceptability and ambiguity
judgments in (52a-e), but we cannot distinguish the acceptable
sentence (52c) from the unacceptable (52f). In other
words, it would be a mystery how the acceptability of a sen- 
tence hinges on the linear order in which the quantifiers no 
and any appear. This mystery has been noted by Ladu- 
saw (1979; §9.2) and Fry (1999; §8.2) as a defect in current 
accounts of polarity sensitivity. 

The second approach, using a control hierarchy with multiple
levels, fares better by comparison.^ We can stick to
left-to-right evaluation, under which, as desired, an any
must be preceded by a no that scopes over it with no
intervening a or some. Indeed, the variations in ambiguity and
acceptability among sentences in (52) are completely
captured. For intuition, we can imagine that the hearer of a
sentence must first process the trigger for a downward-entailing
context, like no, before it makes sense to process a
negative polarity item, like any.* Intuition aside, the
programming-language notion of evaluation order provides the
syntactic hacker of formal types with a new tool with which
to capture observed regularities in natural language.
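
To see how left-to-right evaluation and the per-level conditions of this
section play out on the data in (52), here is a small Python sketch. It is
a simplification and not the type system itself. It assumes a hierarchy of
two levels, ignores levels that hold no quantifier, and treats a level as
BoolPos-to-Bool exactly when it contains some; these assumptions, and the
names level_kind and readings, are introduced here for illustration only.

```python
import re
from itertools import product

# Per-level conditions from the text, over space-terminated determiner tokens.
POS_TO_BOOL = re.compile(r"(some )*(a |no (any )*)*")
BOOL_TO_BOOL = re.compile(r"(a |no (any )*)*")


def level_kind(determiners):
    """Return "bool", "pos", or None for one level's determiners, in order."""
    text = "".join(word + " " for word in determiners)
    if BOOL_TO_BOOL.fullmatch(text):
        return "bool"
    if POS_TO_BOOL.fullmatch(text):
        return "pos"  # assumption: only levels containing "some" count as pos
    return None


def readings(determiners, num_levels=2):
    """Scope orders (widest first) for determiners given in linear order."""
    found = set()
    for levels in product(range(num_levels), repeat=len(determiners)):
        # Group the determiners by level, keeping left-to-right order.
        groups = [[d for d, lvl in zip(determiners, levels) if lvl == n]
                  for n in range(num_levels)]
        kinds = [level_kind(g) for g in groups if g]  # lowest level first
        if None in kinds:
            continue
        # Every BoolPos-to-Bool level must be higher than every Bool-to-Bool one.
        if any(lo == "pos" and hi == "bool" for lo, hi in zip(kinds, kinds[1:])):
            continue
        # Higher levels scope wider; within a level, earlier scopes wider.
        order = sorted(range(len(determiners)), key=lambda i: (-levels[i], i))
        found.add(tuple(determiners[i] for i in order))
    return found
```

Under these assumptions, `readings(["no", "some"])` yields only the order
in which some scopes over no, `readings(["no", "a"])` yields both orders,
`readings(["no", "any"])` yields only the order in which no scopes over
any, and `readings(["any", "no"])` is empty, in line with (52a), (52b),
(52c), and (52f) respectively.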

7. LINGUISTIC SIDE EFFECTS 

This paper outlines how quantification and polarity sen- 
sitivity in natural language can be modeled using delimited 
continuations. These two examples support my claim that 
the formal theory and computational intuition we have for 
continuations can help us construct, understand, and main- 
tain linguistic theories. To be sure, this work is far from the 
first time insights from programming languages are applied 
to natural language: 

• It has long been noted that the intensional logic in
which Montague grammar is couched can be understood
computationally (Hobbs and Rosenschein 1978;
Hung and Zucker 1991).

• Dynamic semantics (Groenendijk and Stokhof 1991),
which relates anaphora and discourse in natural languages
to nondeterminism and mutable state in programming
languages (van Eijck 1998), has been applied
to a variety of natural language phenomena, such
as verb-phrase ellipsis (van Eijck and Francez 1995;
Gardent 1991; Hardt 1999).

However, the link between natural language and continua- 
tions has only recently been made explicit, and this paper's 
use of control operators for a direct-style analysis is novel. 

^Although this paper uses Danvy and Filinski's control
hierarchy, polarity sensitivity can be expressed equally well in
Shan and Barker's system.

*The syntactic distinction among the types Bool, BoolPos,
and BoolNeg may even be semantically interpretable via the
formulas-as-types correspondence, but the potential for such
a connection has only been briefly explored (Bernardi and
Nilsen 2001) and we do not examine it here. In this
connection, Krifka (1995) and others have proposed on pragmatic
grounds that determiners like any are negative polarity
items because they indicate extreme points on a scale.

The analyses presented here are part of a larger project,
that of relating computational side effects to linguistic side
effects. The term "computational side effect" here covers
all programming language features where either it is unclear
what a denotational semantics should look like, or the
"obvious" denotational semantics (such as making each
arithmetic expression denote a number) turns out to break
referential transparency. A computational side effect of the first
kind is jumps to labels; one of the second kind is mutable
state. By analogy, I use the term "linguistic side effects"
to refer to aspects of natural language where either it is
unclear what a denotational semantics should look like, or
the "obvious" denotational semantics (such as making each
clause denote whether it is true) turns out to break
referential transparency. Besides quantification and polarity
sensitivity, some examples are:



(64) a. Bob thinks Alice likes CS187. (Intensionality)

b. A man walks. He whistles. (Variable binding)

c. Which star did Alice see? (Interrogatives)

d. Alice only saw Venus. (Focus)

e. The king of France whistles. (Presuppositions)



To study linguistic side effects, I propose to draw an anal- 
ogy between them and computational side effects. Just 
as computer scientists want to express all computational 
side effects in a uniform and modular framework and study 
how control interacts with mutable state (Felleisen and Hieb 
1992), linguists want to investigate properties common to all 
linguistic side effects and study how quantification interacts 
with variable binding. Furthermore, just as computer scien- 
tists want to relate operational notions like evaluation order 
and parameter passing to denotational models like continu- 
ations and monads, linguists want to relate the dynamics of 
information in language processing to the static definition 
of a language as a generative device. Whether this analogy 
yields a linguistic theory that is empirically adequate is an 
open scientific question that I find attractive to pursue. 
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